🎯 What We'll Cover
For most of this course, “using AI” has meant a chat window: you type, the model answers, you copy what is useful back into your own work. Claude Code is a different kind of thing, and the difference is the whole point of this track. It is an agent that works inside your real project folder — it reads your files, runs your code, edits your documents, uses version control, and can work on its own for long stretches.
This first lesson does three things: it says precisely what Claude Code is (and is not), using the model-versus-harness distinction from Week 10; it draws the sharp line between chatting and working with an agent, around a single principle — the chat is not the archive; and it sets up the question the rest of Lesson A answers: what does it actually take to drive one of these well?
Lesson B then turns to the payoff this track exists for: using Claude Code to make your research genuinely reproducible — inspectable and repeatable by someone who is not you.
🙏 Sources and thanks
This track’s reproducibility framework owes a large and explicit debt to Dominik Lukeš’s workshop Using AI Agents for Reproducible Research (Oxford e-Research Centre). The organising principle that the chat is not the archive, the model-versus-harness framing, the research-habits instruction file, and the “inspect a messy folder” first task are all adapted, with thanks, from that workshop and its accompanying skills. His materials are openly available: techczech.github.io/agents-for-reproducibility (the workshop guide) and github.com/techczech/dominiks-agent-skills (his agent-skills collection, MIT-licensed). The grill-with-docs glossary practice is from Matt Pocock (AI Hero). What this track adds on top — the pre-registration gates, the worked Berg River example, and the instructor’s own practice in the boxes below — builds on that foundation.
🧮 The Model Is the Smallest Part
Week 10.1 made an argument that is easy to nod along to and hard to feel until you see it: the harness is the product. The language model — Claude itself — is one component. What turns it into something useful is everything wrapped around it: the tools it can call, the files it can see, the commands it can run, the permissions it operates under, and the loop that lets it act, observe the result, and act again. In a chat window that harness is thin and invisible. In Claude Code it is thick, and it is the part you are actually driving.
Put concretely: when you ask Claude in a browser tab to help with an analysis, the model is essentially all you get. It can reason about what you paste in and write text back. When you ask Claude Code the same thing, the model is the smallest part of what goes to work. The harness gives it your actual data files, a shell to run a script, the ability to read the error message that script produced, version control to record what changed, and standing instructions about how your project works. The intelligence is similar; the leverage is not.
💡 The one-sentence version
You are not talking to a smarter chatbot. You are driving a different kind of machine — a model with hands, working in your project folder. Almost everything that follows in this track is about driving it well, and about the discipline that makes its work trustworthy afterwards.
🔧 What Claude Code Can Actually Do
Claude Code runs in your terminal and operates on a folder you point it at. Within that folder — and only within it, unless you say otherwise — it has a set of capabilities that are worth naming explicitly, because each one is a piece of the harness you will learn to use deliberately.
Reads and writes your files
It can open, read, and edit the actual documents, data files, and scripts in your project — not a copy you pasted, the real thing. This is powerful and is exactly why permission and raw-data rules matter (Lesson B).
Runs commands and code
It can execute shell commands and run your analysis scripts, then read the output or the error and respond to it. The act–observe–act loop is what makes it an agent rather than a text generator.
Uses version control
It can initialise Git, commit changes, show you diffs, and read the history. Used well, this turns into a reproducibility trace (Lesson B.3), not just a software habit.
Reads standing instructions
A CLAUDE.md file in the project is loaded at the start of every session. It is how you tell the agent the rules of your project once, instead of re-explaining every time. We meet it properly in A.3.
Runs reusable Skills
A Skill is a packaged, reusable workflow (a folder with a SKILL.md) the agent can invoke when it is relevant — a tested research procedure that travels across projects rather than being re-improvised each time (Lesson B.2).
Spawns subagents
For a bounded job it can launch a separate, focused agent — for example, an independent check of an analysis it just produced. This connects directly to the adversarial-verification idea from Week 9.
Connects to external tools (MCP)
Through the Model Context Protocol it can reach approved external services and data sources — a reference database, a paper repository — the same MCP idea introduced in Week 10.
Plans before it acts
It has a read-only plan mode: it inspects and proposes a plan without changing anything, so you can approve the approach before a single file moves. This is your most important safety control, and we use it in A.3.
None of these capabilities is exotic on its own — researchers have used shells, version control, and scripts for decades. What is new is that one system can use all of them in a loop, on your behalf, from a plain-language request. That is the capability Week 11.1 called “AI as a substantive collaborator,” made operational. And as Week 11.1 also insisted: the more the system can do unsupervised, the more the verification habit matters, not less.
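The act, observe, act loop described above can be made concrete with a toy sketch. This is not how Claude Code is implemented: the function choose_next below is a hypothetical stand-in for the model (a fixed rule, not a language model), and the two one-line scripts it runs are invented for illustration. What the sketch does show is the shape of the loop: run a command, read what it printed, and let that output decide the next action.

```python
from __future__ import annotations

import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Observation:
    """What the harness hands back to the decision-maker after each action."""
    command: list[str]
    returncode: int
    output: str


def run(command: list[str]) -> Observation:
    """Act: run one command and capture everything it printed."""
    result = subprocess.run(command, capture_output=True, text=True)
    return Observation(command, result.returncode, result.stdout + result.stderr)


def choose_next(history: list[Observation]) -> list[str] | None:
    """Hypothetical stand-in for the model: a fixed rule, NOT a real model."""
    if not history:
        # First attempt: a script with a deliberate bug (misspelled name).
        return [sys.executable, "-c", "print(totl)"]
    last = history[-1]
    if last.returncode != 0 and "NameError" in last.output:
        # Observe the error message, then act again with a corrected script.
        return [sys.executable, "-c", "total = 2 + 3; print(total)"]
    return None  # The last run succeeded, so there is nothing left to do.


def agent_loop(max_steps: int = 5) -> list[Observation]:
    """Repeat act and observe until the stand-in stops or the step cap is hit."""
    history: list[Observation] = []
    for _ in range(max_steps):
        command = choose_next(history)
        if command is None:
            break
        history.append(run(command))
    return history


if __name__ == "__main__":
    for step, obs in enumerate(agent_loop(), start=1):
        print(f"step {step}: exit={obs.returncode} output={obs.output.strip()!r}")
```

The step cap (max_steps) is a design choice worth noticing: a loop that can act on its own needs a limit that does not depend on the decision-maker choosing to stop.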
📊 Why This Is Categorically Different From Chat
It is tempting to file Claude Code under “a more capable chatbot.” That framing will mislead you. The difference that matters is not how clever the model is; it is where the work lives.
In a chat window
You paste context in. The model answers. The useful output, the reasoning, the decisions you made along the way — all of it lives in a conversation thread. Next week the thread is buried; next month you cannot reconstruct what you actually did, or why. The work is real but the record evaporates.
In Claude Code
The work lives in your files. The script it wrote is in scripts/. The output is in outputs/. The decision it made is in a log you told it to keep. The change is in the Git history. Six months later, you — or a stranger — can open the folder and see what happened.
📁 The chat is not the archive
This is the organising principle of the whole track, so it is worth stating once, plainly: save your sources, notes, instructions, scripts, outputs, and decisions into files — not into a chat thread. The conversation is where the work is commissioned; the project folder is where the work lives. A chatbot session is a conversation you will lose. A project folder is the unit of reproducible research.
Everything Lesson B builds — the folder discipline, the decision log, the reproducible analysis — is a consequence of taking this one sentence seriously.
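As a minimal sketch of what “the project folder is where the work lives” might look like in practice, the script below creates a folder layout and a decision log. Only scripts/, outputs/, and CLAUDE.md are named in this lesson; the sources/ and notes/ folders, the DECISIONS.md filename, the starter instruction text, and the example decision are all assumptions made for illustration, not the conventions Lesson B will set.

```python
from __future__ import annotations

import tempfile
from datetime import date
from pathlib import Path

# Illustrative names: the lesson mentions scripts/ and outputs/;
# sources/ and notes/ are assumptions based on its list of things to save.
FOLDERS = ("sources", "notes", "scripts", "outputs")

# Hypothetical starter text for the standing-instructions file.
STARTER_INSTRUCTIONS = """\
# Project instructions (illustrative starter, edit to suit)
- Save scripts in scripts/ and results in outputs/.
- Record every analysis decision in DECISIONS.md.
"""


def scaffold(root: Path) -> None:
    """Create folders and starter files, never overwriting existing files."""
    for name in FOLDERS:
        (root / name).mkdir(parents=True, exist_ok=True)
    instructions = root / "CLAUDE.md"
    if not instructions.exists():
        instructions.write_text(STARTER_INSTRUCTIONS, encoding="utf-8")
    log = root / "DECISIONS.md"  # assumed filename for the decision log
    if not log.exists():
        log.write_text("# Decision log\n", encoding="utf-8")


def log_decision(root: Path, decision: str, reason: str,
                 on: date | None = None) -> str:
    """Append one dated entry to the decision log and return the line written."""
    day = (on or date.today()).isoformat()
    entry = f"- {day}: {decision} (why: {reason})\n"
    with (root / "DECISIONS.md").open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "demo-project"
        scaffold(project)
        log_decision(project, "Dropped rows with missing dates",
                     "they cannot be ordered in time")
        print((project / "DECISIONS.md").read_text(encoding="utf-8"))
```

The log is append-only and the scaffold refuses to overwrite existing files, so running either function twice cannot erase an earlier record. That is the property that lets a stranger trust the folder later.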
⚖️ This is not “abandon chat”
Chat is still the right tool for a great deal: a quick question, a brainstorm, a one-off paragraph, thinking out loud. The honest distinction is about durability. Reach for chat when the value is in the answer you read right now. Reach for an agent in a project folder when the value is in work that has to survive, be repeated, or be defended later. Most researchers will use both, for different things, and knowing which is which is itself a skill.
📖 Download: the companion guide
The instructor has written a short guide, Claude Code as a Co-Scientist, that gathers this track’s territory into a single document: the model-versus-harness mental model, the human work an agent must never touch, the reproducibility conventions of Lesson B, and a reference for the research Skills that build on them. It is the natural companion to these two lessons and goes further than we can here. Many of its ideas, the reproducibility framework above all, come from Dominik Lukeš’s work, credited at the top of this lesson; the guide builds openly on that foundation and is shared in the same spirit.
Download: co-scientist-guide.pdf. The seven research Skills it documents are available as a separate download in Lesson B.2.
Coming up in A.2: the honest case. Before you invest time learning to drive this, you deserve a straight account of what it costs, who it excludes, what you can approximate for free, and the genuine shift in how you work that it demands — the move from chatting to managing an agent.